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An analytic model for the gravitational clustering 
of dark matter haloes 



ABSTRACT 

We develop a simple analytic model for the gravitational clustering of dark haloes. The 
statistical properties of dark haloes are determined from the initial density field (assumed 
to be Gaussian) through an extension of the Press-Schechter formalism. Gravitational 
clustering is treated by a spherical model which describes the concentration of dark haloes 
in collapsing regions. We test this model against results from a variety of N-body simula- 
tions. The autocorrelation function of dark haloes in such simulations depends significantly 
on how haloes are identified. Our predictions agree well with results based on algorithms 
which break clusters into subgroups more efficiently than the standard friends-of-friends al- 
gorithm. The agreement is better than that found by assuming haloes to lie at the present 
positions of peaks of the linear density field. We use these techniques to study how the 
distribution of haloes is biased with respect to that of the mass. The initial (Lagrangian) 
positions of haloes identified at a given redshift and having circular velocities Vc = v*(z) 
(i.e. mass equal to the characteristic nonlinear mass M* at that redshift) are very weakly 
correlated with the linear density field or among themselves. As a result of dynamical 
evolution, however, the present-day correlations of these haloes are similar to those of the 
mass. Haloes with lower Vc are biased toward regions with negative overdensity, while those 
with higher Vc are biased toward regions with positive overdensity. Among the haloes iden- 
tified at any given epoch, those with higher circular velocities are more strongly correlated 
today. Among the haloes of given circular velocity, those at higher redshifts are also more 
strongly clustered today. In the "standard CDM" model, haloes with Vc = 200kms~"^ and 
identified at redshift z ^ 2 have present-day autocorrelation comparable to that of normal 
galaxies in the real universe. 
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1 INTRODUCTION 



It is generally believed that the large scale structures in the universe formed through 
the growth of small inhomogeneities by gravitational instability. In the hierarchical cluster- 
ing scenario of structure formation, a dominant dissipationless component of dark matter 
is assumed to aggregate into dark matter clumps, the virialized parts of which are usually 
called dark haloes. Galaxies then form by the cooling and condensation of gas within 
these dark haloes (White and Rees 1978). Non-gravitational effects, which are difficult to 
model, are likely to be critical in galaxy formation, yet have little effect on the formation 
and clustering of dark haloes. The study of the formation and clustering of haloes is there- 
fore less ambiguous than that of galaxies, but is nevertheless important, because of the 
close relation between galaxies and haloes. 

Dark haloes are highly nonlinear objects. Their evolution is usually studied by nu- 
merical simulations (e.g. Frenk 1991; Gelb & Bertschinger 1994a, b and references therein). 
Such simulations are limited both in resolution and in dynamical range and can be difficult 
to interpret. Our understanding of their results can be substantially enhanced by simple 
physical models and by analytic approximations. Also a simple and successful analytic 
model can be used to carry out a large parameter study without requiring a large amount 
of computer time. The present paper attempts to provide such a model. 

The initial distribution of density fluctuations in the universe is usually assumed to 
be Gaussian, and so to be described completely by its power spectrum. This, in turn, 
is derived from a model for the origin of structure. It is perhaps feasible to associate 
dark haloes or galaxies with special regions of the initial density fleld and to consider the 
clustering of these objects which results both from initial conditions and from motions due 
to gravitational interaction. Kaiser (1984) used this idea to explain the strong clustering 
of Abell clusters as a consequence of the statistical properties of high peaks in an initial 
Gaussian fleld. His formalism was developed extensively by Bardeen et al. (1986, hereafter 
BBKS). BBKS found that if galaxies formed at high peaks, they should be more clustered 
than the mass, and that if galaxies of different types are associated with peaks of different 
heights, they should have different clustering properties. Galaxies are then biased tracers 
of the mass. However, it is not known how well galaxies correspond to high peaks of 
the initial fleld, and there is some direct evidence that the correspondance of such peaks 
with dark haloes is not particularly good (Frenk et al. 1988; Katz, Quinn & Gelb 1993). 
Furthermore the clustering of peaks in Eulerian space may differ substantially from that in 
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the initial (Lagrangian) space and it is unclear how to deal with the problem that a single 
dark halo may contain several galaxies. Press & Schechter (1974, hereafter PS) developed 
a formalism which identifies haloes at a given cosmic time as regions (in the initial density 
field) which just collapse at that time according to a spherical infall model. Both the 
halo mass function they derived and the detailed structure of hierarchical clustering which 
their theory predicts have gained considerably in plausibility from recent theoretical work 
(e.g. Bond et al. 1991; Bower 1991; Lacey & Cole 1993) and from tests against N- 
body simulations (Efstathiou et al. 1988, hereafter EFWD; Kauffmann & White 1993; 
Lacey & Cole 1994). Unfortunately, these various tests show that although the statistical 
predictions of the theory work well, the basic hypothesis on which it is based works very 
poorly on an object by object basis (see Bond et al. 1991; White 1995). 

The PS theory developed in the above papers does not provide a model for the spatial 
distribution of dark haloes. As we show below, it can, however, be extended to construct 
such a model. Dark haloes are defined using the initial density field as in the PS formal- 
ism. Their gravitational clustering is treated by a spherical model which describes their 
concentration in dense regions through their relation to the mass distribution in the initial 
density field. This model is tested by comparing its prediction for the two-point correlation 
functions of haloes with those derived from N-body simulations. We describe our model 
in Section 2 and compare its predictions with N-body simulations in Section 3. Section 4 
discusses how spatial distributions of dark haloes of different types are biased with respect 
to the mass distribution. Section 5 summarizes our main conclusions. Since the analytic 
argument of the paper is quite conplex, we provide an Appendix which summarizes how 
our formulae should be used for the calculation of various correlation functions, for ex- 
ample the auto- and cross-correlations of haloes of differing circular velocities, and the 
present-day correlation properties of objects which are identified as individual haloes at 
high redshifts. 

2 THE MODEL 

Although the model described here may readily be extended to other cosmologies, 
we assume, for simplicity, an Einstein-de Sitter universe (i.e. that the total mass density 
parameter = 1, and the cosmological constant A = 0). All physical length scales are 
quoted either assuming a Hubble constant H = 50 km s~"^Mpc~"^ or in units of /i~"^Mpc 
where h = if/[100 km s'^Mpc"^]. 



4 



2.1 The statistics of initial density field 



We assume that the initial overdensity field, S(x.) = [/3(x) — p]/p (whose Fourier 
transform is denoted by (5k), is Gaussian and is described by a power spectrum P{k) = 
For most of our discussions, we will take the standard CDM model (Blumenthal 
et al. 1984; Davis et al. 1985) as an example. In this model the power spectrum of the 
mass density fiuctuations is given by equation (G3) of BBKS: 

P{k) = AkT^k); (1) 

T{k) = ^ ^ [1 + 3.89/€ + (16. Ik)^ + (5.46/€)^ + (G.TIk)^] ' 

where k = k /[{Qh^)Mpc~^]. We take O = 1 and /i = 0.5. A is a normalization factor to 
be specified below. We will also consider scale-free models with power-law spectra: 

P{k) = Ak'^. (2) 

The field (5(x) can be smoothed by convolving it with a spherical symmetric window 
function VF(i?o; r) having comoving radius Rq (measured in current units). The smoothed 
field is 

S{Ro;^) = J W{Ro;\^-y\)S{y)dy 

= J^(5kW"(i?o;A;)exp(zk-x), (3) 
k 

where W{Ro,k) is the Fourier transform of the window function W{Ro', r). Therefore, 
for each point x, there is a trajectory Sq = S(Ro) which describes the overdensity S as a 
function of the window radius Rq . A useful quantity characterising the power spectrum is 
the rms fiuctuation of mass in a given smoothing window: 

A\Ro) = mRo;^)\') = Y,PikW'{Ro;k). (4) 

k 

A convenient way to normalize the power spectrum [i.e. to determine A in equations (1) 
and (2)] is to specify A(i?o) at a given radius for a given window function. We write 
A(8 /i~"^Mpc) = CTg for a top-hat window: W{Ro;r) = 1 if r < i?o and otherwise. This 
normalization is related to the conventional "bias factor" b through erg = 1/6. 
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For a given window function the smoothed field S(Ro;x.) is Gaussian and obeys the 
foUowing distribution function 

p{uo)duo = j^^^:^exp~''°/^duo, (5) 

where = Sq / A(Ro). Since both Sq and A grow with time in the same manner in hnear 
perturbation theory, it is convenient to use Sq and A hnearly extrapolated to present time. 
It is clear that the extrapolated quantities still obey equation (5). In what follows, we 
write our formula in terms of the extrapolated quantities, unless otherwise stated. Also we 
will omit writing explicitly the smoothing radius i?o, but we will often use subscripts to 
distinguish A, and other quantities, at different smoothing lengths [e.g. Aq = A(i?o), Ai = 
A(i?i)]. For a top-hat window function, which is adopted for almost all our discussion, 
the average mass M(Ro) contained in a window of radius Rq is Mq = (47r/3)/9o-Ro, where 
po is the mean density of the universe. For a given spectrum, the quantities i?o, Aq and 
Mo are equivalent variables. 

The two-point correlation function of mass in Lagrangian space, i^^, can now be readily 
derived from the statistics of the Gaussian density field. The average mass correlation 
function which is related to the two-point correlation by 

el(i?o) ^ ^ / ATrr'drtir) (6) 

^0 JVo 

with Vo = 47ri?g/3, can be formally written as 

Ci(Ro) = 4f I Mop(Ao,6o\m)d6o - 1 (7a) 
^wfo J 

where Mq = (1 -|- (5o)Mo, and p(Ao , i^o |"^), which we abbreviate as p(0|m) hereafter, is 
the conditional probability for a spherical region with comoving radius Rq to have a mean 
linear overdensity (5o, given that there is a mass particle in its central volume element. 
According to Bayes' theorem, we write p(0|m) oc p(0)p(m|0) where p(0) is an abbreviation 
of p(i^o) given by equation (5); p(m|0) is the probability of finding a mass particle in 
the central volume element of a sphere with radius Rq and overdensity Sq. We expect 
that p(m|0) is proportional to the particle number density at the center of the sphere, or 
approximately p(m|0) oc Mq/Rq. ^{Rq) can then be written as 

,L.„ 1 lM^p{0)dSo 

^-^""'^ - Wo J Mop{0)dSo ^''^ 
6 



In the present case Mo = Mo(l + (5o), and ^(i?o) = / S^p{0)dSo = A^{Ro). This resuh 
is trivial, but the procedure is quite suggestive: if a quantity (in the above example, the 
mass) is a function of the linear density Sq in a given window, it is possible to calculate 
the spatial correlation function of this quantity. In the next subsection, we will see that 
the number of dark haloes in a region is related to the linear overdensity of that region. 
We can therefore hope to obtain an analytical model for the correlation functions of dark 
haloes in Lagrangian space. 



2.2 Dark haloes from the initial density field 



We assume that dark haloes are spherical symmetric, virialized clumps of dark matter. 
In an Einstein-de Sitter universe, the physical radius i? of a spherically symmetric pertur- 
bation with comoving initial radius Rq and (extrapolated) mean interior density contrast 
So > evolves with redshift z as 

R(z) 3 1 — cos 6 

So the radius of a perturbation will reach its maximum Rj^ at redshift Zj^^ with Rj^ and 
Zja given by equation (8) with 6 = tt. The perturbation collapses to a point (R = 0) at 
a redshift Zc = 1.686/(5o — 1. In practice, the perturbation will not, of course, collapse 
to a point, but will virialize at a radius -Rvir. It is usually assumed that a collapsing 
structure virializes at half its radius of maximum expansion. This would give a density 
contrast at the time of collapse of ~ 178. We model the density profile of dark haloes 
by that of a singular isothermal sphere. The mass M and circular velocity Vc [defined by 
Vc = (GM/ry/'^] of a halo are therefore related to its initial comoving radius Ri (note: in 
what follows the properties of dark haloes are labelled by subscripts 1, 2; the subscript 
is reserved for the properties of uncoUapsed spherical regions) and the redshift z at which 
it is found by 

M=^poRl; v, = im{l + zfl^HR^, (9) 

where the latter equation assumes that the mean density contrast of clumps when they 
virialize is 178 and that the haloes found at redshift z have all just virialized. 
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The probability for a spherical region to be part of a single collapsed structure (i.e. 
for its overdensity S to exceed Sc = 1.686) by redshift z is 



F{Ri,z) 



p{Ri, z, S)dS, 



(10) 



where p(Ri^z^S) is given by equation (5) with Rq and Sq replaced by Ri and 6(1 + z) 
respectively. According to PS, F{Ri , z) gives half the fraction of matter which is in haloes 
of radius exceeding Ri (or mass exceeding M, see equation 9 above) at redshift z (see also 
Bond et al. 1991, hereafter BCEK). The differential mass distribution is then 



f{M,z)dM 



OF 
-2— -dM 

dM 
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rexp 



8^ 



2A2 



dM. 



(11) 



(27r)i/2 A2 

Hence the comoving number density of haloes, expressed in current units, as a function of 
Vr and z is 

8i{i+zr 



-?,{im^)8cH\l+ zfl^ dlnA 



d Inu, 



-exp 



2A2(i?i) 



(12) 



(27^)3/2^;4A(i?l) 

(see White and Frenk, 1991 for details). Since equation (12) applies to haloes that have 
not been incorporated into larger collapsed systems at a given redshift, it accounts auto- 
matically for the cloud-in-cloud problem. 

We also need a formula to describe the relation between haloes and the surrounding 
density field. BCEK show that the probability that a randomly chosen spherical region 
(-Rc'^o) which is not contained in a collapsed object at redshift z is 

^('^0)^/^0 = TTr^TTT _ g-(.o-2..)V2l ^^^^ (13) 

(zvrj-^/^ L J 

where Vc = 8c{l + z)l The fraction of the mass in such regions of mass Mq (correspond- 
ing to rms overdensity Aq) and present extrapolated overdensity (5o, which at redshift zi 
[corresponding to extrapolated critical overdensity 8i = [1 -\- zi )8c\ is in haloes with mass 
in the range Mi — )• Mi -\- dMi (where Mi < Mq by definition) is 



fiAi,8i\Ao,8o)^^dMi 



1 



8i - 8o 



(27r)i/2 (A2 - A2)3/2 



exp 



(^i-^o)^ 
■2(A?-Ag) 



dA? 
dMi 



dMi. (14) 



(Bower 1991; BCEK). So the average number of Mi haloes at redshift zi in a spherical 
region with comoving radius Rq and overdensity 8o is 



Ar(l|0)4#dMi ^ ^/(l|0)4#dMi. 
^ ' UMi m/^ ' UMi 



(15) 



where /(1|0) = /(Ai , (5i | Aq, i^o)- Notice that since Mi is coUapsed at 2;i > whereas Mq 
is analysed at z < zi^ we have that Si > Sq. For haloes of a given mass Mi and redshift zi^ 
the distribution of this number is determined by the statistics of the background density 
field (Ao,<5o). 

2.3 Correlations of dark haloes in Lagrangian space 

Let us first consider the cross-correlation between dark haloes and mass. Denote by 
p(Ao, (^0 |Ai, (5i)c/(5o or simply p(0\l)dSo the conditional probability for a spherical region 
with comoving radius Rq to have a mean linear overdensity (5o, given that there is, at 
redshift zi = Si / Sc — 1^ a dark halo with mass Mi (corresponding to initial comoving 
radius Ri < Rq) at its centre. To be consistent with the definition that a halo should 
not be contained in a larger halo, the region (Rq^Sq) should also not be contained in a 
larger collapsed region. The average cross-correlation function between haloes and mass, 
at radius Rq in Lagrangian space, Ctmi^o), can formally be written as 

eL(i?o) = r^^'^'^ Sop{0\l)dSo, (16) 

J — oo 

where the integral limit is chosen so that the larger region has not collapsed by redshift 
zi. According to Bayes' theorem, we write 

KO|l)ocg(0)Kl|0) (17) 

where q(0) is given by equation (13); p(l|0) is the probability to find a halo of type 1 
(i.e. with corresponding initial redius Ri and identified at redshift zi) at the centre of 
a Lagrangian sphere with radius Rq and with overdensity Sq. We assume that p(l|0) is 
proportional to the number density of type 1 haloes at the center of the spherical region, 
and write p(l|0) oc J\f{l\0)/Rl. This is an approximation, because it assumes that the 
central number density of haloes is equal to the average number density. The function 
Chni(-Ro) can then be written as 

-L Soq{0)Af{l\0)dSo 

4m(^o) = • (18) 



It can be proven that the denominator in equation (18) is equal to n(vi^ zi)Vo^ where 
Vo = 47ri?o/3 and n(vi,zi) is the comoving density of type 1 haloes (equation 12). 
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The cross-correlation function between haloes of type 1 and those of type 2 in La- 
grangian space, ^2(^0), can be obtained in a similar way. Let A/'(2|0; 1) be the number of 
type 2 haloes in a spherical region with comoving radius i?o, given that this region has an 
overdensity Sq and a halo of type 1 in its central volume element. We can write 

I /-(i+^i)^ 



Ci2(Ro) = TTT / -^(210; l)p{0\l)dSo - 1 (19) 

Since haloes are spatially exclusive in our model, we assume that 

A/'(2|0;l) = ^°^^^V (A2,^2|Ao,^o). (20) 
Using the same argument as in deriving equation (18), we obtain 

^''^ n{v2.Z2)Vo /(^;-)^^Ar(l|0)g(0)rf^o ■ 

Since by definition p(0|l) = if (5i < (5o, the integral limits ensure that the large region 
has not collapsed at 2; > min(2;i, Z2). The ^2 defined has the desired property that 
C12 — C21 1 if ^0 is much larger than both Mi and M2 ■ 

2.4 Dynamical evolution of correlation functions 

The correlation functions defined by equations (7b), (18) and (21) are correlation 
functions in Lagrangian space. In these cases we have an ensemble of spheres with the 
same Lagrangian radius, and the correlation functions are estimated by the statistics of 
masses and halo numbers in these spheres. The correlation functions in physical (Eulerian) 
space should, however, be estimated from samples of spheres with the same physical radius. 
We therefore need to describe the dynamical evolution of the correlation functions. To treat 
the dynamical evolution accurately one needs numerical simulations. Here we propose an 
analytic model based on simple approximations which we discuss below. 

Within a spherical region with radius i?o in Lagrangian space, the number of haloes 
which have mass in the range Mi — )• Mi + dMi at redshift zi is given, via the mean 
overdensity of the region (5o, by equation (15). We assume that the mass shell {Rq^Sq) 
will contract or expand, depending on (5o > or < 0, according to the spherical evolution 
model of density perturbations in an Einstein-de Sitter universe. We also assume that as 
the mass shell contracts or expands the mass and the number of dark haloes within it do 
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not change. The conservation of mass is easy to understand, while the conservation of the 
number of dark haloes is justified by the fact that the definition of dark haloes involved in 
equations (12)-(15) takes account the disappearance of dark haloes due to merging. The 
evolution of the radius i? of a mass shell (-Ro, '^o) with (5o > is described by equation (8). 
For So < 0, we should replace (1 — cos 6) in equation (8a) by (ch^ — 1), (^ — sin 6) in equation 
(8b) by (sh^ — ^), and Sq by \So\. For a given redshift, a physical radius R corresponds to 
a curve in the (Rq^Sq) space. In Figure 1 we plot three (solid) curves which correspond 
to i? = 1, 10 and 100 Mpc at 2; = 0. Since the evolution equations depend on R through 
-R/-R0, curves corresponding to other values of R can be obtained by shifting the curves on 
the plot along the Rq axis. These curves represent the evolution of different mass shells 
before collapsing (i.e. Sq < Sc). Each point on the curves corresponds to a spherical mass 
shell which evolves into the corresponding physical radius at 2; = (or similarly at another 
redshift). Now suppose we have an ensemble of spherical regions with radius R in Eulerian 
space, each of which corresponds to a point in the Rq-Sq space. Since the mass and the 
number of haloes contained in each sphere are known, one can estimate the correlation 
functions at a physical radius R from the statistics of the mass and halo numbers. 

Let pe('^o |1, R)dSo be the probability of finding a spherical region with Eulerian radius 
R and with linear overdensity in the range Sq ^ Sq -\- dSo^ given that there is a type-1 halo 
at its centre. We can relate the Lagrangian radius Rq of this region to R and Sq through 
the spherical model of perturbation evolution. It is obvious that accuracy of the result 
will depend on how accurate it is to use the spherical model in describing the evolution 
of a mass shell (Rq^Sq). We may expect that the spherical model should work better for 
mass shells that have a larger mass halo at their centers. We therefore write the average 
cross-correlation functions between type 1 haloes and mass and between type-1 and type-2 
haloes in Eulerian space as 



respectively, where V = (Att /3)R^ . In equation (23), we assume, without loss of generality, 
that Ml > M2. The factor [{Ro/Rf - 1] in equation (22) gives the density contrast in 
Eulerian space. 




(22) 



and 




(23) 
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It is important to note that the probabihty pe('^o|1,-R) is defined for a halo at the 
centre of an Eulerian sphere. According to Bayes' theorem, we can write 

n r,. pe(iIo)K'^oI^) ... ^ 

J pE(l\0)p(do\R)ddo 

where p(So\R)dSo is the probabihty that a spherical region with Eulerian radius R has 
linear overdensity in the range Sq ^ Sq -\- dSo^ and pe(1|0) is the probability of finding 
a halo of type 1 at the centre of such a sphere. As before, we assume that pe(1|0) is 
proportional to the number density of type-1 haloes, but now in Eulerian space. Then we 
have 

PE(l|0)ocijAr(l|0). (246) 

Equation (24b) is, of course, only an approximation, because it assumes that the number 
density of haloes at the center of the spherical region is proportional to the average number 
density within the region. We can now write 

To derive an expression for p(So\R)^ we consider the corresponding cumulative function 
p(So > S\R). For the power spectra under consideration, the rms mass fiuctuation A(i?o) 
decreases monotonically with increasing Rq. The quantity = (5o/Ao, therefore, increases 
monotonically with Sq for a given R. Our problem now becomes to find p{vo > iy\R). We 
note that, for a given i?, goes from — oo to oo as (5o goes from —oo to Sc = 1.686. For 
a Gaussian field, obeys equation (5) for a given Lagrangian radius. We therefore make 
the following ansatz: 

p{uo > v\R) = / ^=e-^«/Mz/o. (26a) 



/27r 

This ansatz implies that, for given R and the function p(So\R) is given by 

K^o|i?) = ^e-^o/^l^, (266) 

where = Sq / A(Ro) depends on Sq both directly and through Rq = Ro(So^R) which is 
given by the spherical model of equation (8). 

The motivation for our ansatz is as follows. We recall that, for each point x in the 
Lagrangian space, the overdensity Sq in a window of Lagrangian radius Rq is described by 
the trajectory So(Ro) of a "particle" in Rq-Sq space (see Figure 1). All trajectories obey 
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the boundary condition that (5o — > when i?o — ^ oo. As shown in Figure 1, for a given 
Eulerian radius i?, each trajectory wiU pass through Rq = R as the "particle" moves from 
i?o = oo to i?o = 0. Each trajectory wiU also cross the boundary Sq = (5o(-Ro, -R), since the 
probability for a particle to go to (5o = — oo is zero. For a given i?o, the probability for 
a trajectory to cross the vertical line Rq = Rq above Sq = S is given exactly by equation 
(26a) with v = S/Aq. If each trajectory crossed the boundary Sq = So(Ro^R) only once, 
this would also be the probability for a trajectory to cross the boundary above Sq = 8, 
and the ansatz (26a) would be exact. However, a trajectory can cross a given boundary 
more than once in two different ways: it can either cross the boundary more than once 
at 8q < 8c, or cross the boundary first at 8q = 8c and then at 8q < 8c. These two cases 
are schematically shown in Figure 1 by short-dashed and dotted curves, respectively, for a 
point 'A' on the boundary 8o = 8o(Ro,R) with R = 10 Mpc. These kinds of trajectories 
should be excluded in calculating the correlation functions of dark haloes. In the first 
case, the contribution of point 'A' to the correlation function at Eulerian radius R = 10 
Mpc has already been taken into account at point 'C, because the trajectory implies that 
the region represented by 'A' is contained in that represented by 'C. In the second case, 
the region 'A' is part of a larger region (represented by 'B') which has collapsed at 2; = 
(because the overdensity within it has reached 80 = 8c) and no haloes can exist in such 
a region according to our definition of dark haloes. According to BCEK, the probability 
of finding a spherical region {Rq,8q) which is not a part of a larger collapsed region is 
given by equation (13). This suggests that it may be better to make our ansatz for p{8q \R) 
according to equation (13) instead of equation (5). As we will show later, both ansatze 
give quite similar results, and the ansatz based on equation (13) is not superior to that 
based on (5) for a reason we discuss below. 

To see how well our ansatz works, we generate a large number of trajectories in (i?o , f^o ) 
space (see Figure 1) starting from i?o = 1000 Mpc (the results do not change significantly 
if we start from a larger i?o) and 8q = 0. We calculate the fraction of such trajectories 
which first cross the boundary 8q = 8q{Rq,R) for a given R near 8q = 8. This fraction 
gives the probability p{8q\R). The trajectories are generated using the fact that, for a 
sharp A;-space window function \W{R{),k) = 1 if A; < l/i?o and W{R{),k) = otherwise], 
the change in (5o, D8q, due to a change of i?o from i?o to Ri = Rq — DRq, is a Markov 
random walk governed by a probability function 
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{D8oy 



2(A?-A^) 



(27) 



(see BCEK for a discussion). For a top-hat filter as used in the present paper, equation (27) 
is only an approximation since successive steps in the random walk are correlated (Bower 
1991). In Figure 2 we compare the results given by our ansatz (26) (solid curves) and that 
based on equation (13) (dashed curves) to those given by the Monte Carlo trajectories 
described above. The results shown are for a erg = 1 CDM spectrum and for two values of 
Eulerian radius, R = 10 and 2 Mpc, representing large and small scales. The figure shows 
that both models are quite accurate for i? > 10 Mpc and that they are indistiguishable 
on these scales (note: the larger R is, the more accurate the models are). The ansatz 
(26) holds also reasonably well on scales as small as 2 Mpc, but it tends to give excessive 
weight to regions with low linear densities. This is clearly due to fact that for a given R a 
lower value of Sq corresponds to a lower value of Rq and a larger value of Aq. A trajectory 
thus has a larger probability of crossing a given boundary Sq = (5o(-Ro, R) several times for 
smaller values of Rq. As a result the ansatz overestimates the probability of first crossing 
at lower value of Sq. This overestimation is more severe in the model based on equation 
(13), because this model does not allow multiple crossings, and so reduces the relative 
probability of high values of Sq. The accuracy of the ansatz may depend on the amplitude 
and shape of the power spectrum. For a power spectrum in which the rms fluctuation A 
on small scales is larger than that for the erg = 1 CDM spectrum, the ansatz works worse 
for small R. To get an idea of how strongly our results depend on the ansatz, we show in 
Figure 3 the correlation functions obtained by using model (26) (curves) and that based on 
equation (13) (crosses). The results are shown for haloes with different circular velocities 
(or masses, see equation 9) in a CDM model with erg = 1. The two models agree very 
well for most cases. Noticeable disagreement appears only on small scales for haloes with 
low circular velocities (vc < lOOkms""^). The agreement is better for CDM models with 
smaller erg, because this means a lower value of A(i?o) for a given Rq. We therefore believe 
that the difference between the ansatz (26) and the proper probability function given by 
the Monte-Carlo simulation is unlikely to produce a substantial difference in the results; 
we use this ansatz in the remainder of our discussion. 

A similar model can be constructed for the autocorrelation of the mass by using equa- 
tion (7b). Since the mass correlation depends also on the virialization of small structures, 
we will present this model in a separate paper. 

3 COMPARISON WITH NUMERICAL SIMULATIONS 

The critical test of our analytic model comes from a comparison with numerical sim- 
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ulations. Here we compare our model predictions for the two-point correlation functions 
of dark haloes with the results given by numerical simulations. 

Gelb and Bertschinger (1994a, hereafter GB94a) have carried out high resolution N- 
body simulations for CDM models with O = 1, /i = 0.5 and different bias parameters. 
GB94a used two algorithms to identify haloes. The first algorithm (called DENMAX) 
identifies haloes as local density maxima in the smoothed, evolved density field. In the 
second algorithm (the friends-of- friends algarithm, called FOF), haloes are selected from 
the evolved particle positions by identifying all particles within a given linking distance (/) 
of each other. Two linking distances were used, with / = 0.1 and 0.2 times the mean particle 
separation in the simulation. As discussed in detail in GB94a, the DENMAX may have an 
advantage over FOF in its ability to break up large dense clusters into subgroups, while 
still being able to detect smaller, less dense haloes in the field. In Figure 4 the solid, short- 
dashed and long-dashed curves show the average correlation functions of haloes identified 
by DENMAX, FOF(/ = 0.1) and FOF(/ = 0.2), respectively. Resuhs are shown for CDM 
models with erg = 1 (4a) and erg = 0.5 (4b). Each curve shows the correlation function of 
haloes with masses (or circular velocities) larger than some threshold value. This value, 
which depends on the algorithm for halo identification, is chosen so that the number 
density of haloes selected from the simulation is the same as that predicted by equation 
(12) for haloes with Vc greater than a given threshold value Vc (indicated in each panel). 
The number of haloes in the simulation box, which has volume (lOOMpc)^, is indicated by 
iVh. The predictions of our analytic model for the same model parameters are shown as 
crosses. The average correlation function between these haloes are obtained from equation 
(23) by integrating both the denominator and the numerator in the first term on the right 
hand side over vi and V2 (which are related to Mi and M2 by equation 9) from 14 to 00. 
The dotted lines in the figure show the average correlation function which corresponds to 
the differential correlation function i^(r) = [5 h~^M.pc/r]~^'^ observed for normal galaxies. 
The first thing one notices in this figure is the dependence of the correlation function 
on the halo identification algorithm; the simulation with DENMAX gives the strongest 
correlations and FOF(/ = 0.2) the weakest. The effect is quite large on small scales, and 
it is stronger for bigger haloes. Such a dependence is obviously due to the fact that the 
FOF algorithm with large linking length tends to merge haloes in high density regions 
and thereby reduce the weight of such regions (see GB94a for a discussion). The other 
remarkable result shown in the figure is the agreement between the correlation function 
predicted by our model and that based on the DENMAX algorithm. For erg = 1, the 
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agreement is almost perfect. For erg = 0.5, our model gives a higher amplitude on small 
scales for haloes with Vc > 200kms~"^. Given the many assumptions made by our model 
and the uncertainties inherent to the simulation, we regard this agreement to be as good 
as we could hope for. The reasons for this agreement are far from clear, because, as we 
will show in Section 4, the low-mass haloes do not correspond to high peaks of the initial 
density field and our correlation functions differ very substantially from their Lagrangian 
analogues. 

It is interesting to compare our results with those given by Katz, Quinn and Gelb 
(1993). They used the same set of simulations to compare the correlation function of 
particles initially located near peaks in the linear density field with that of the haloes which 
actually form. They found the correlation function of the tagged peaks to be much larger 
than that of the haloes. Similar results were found by Mann, Heavens & Peacock (1993), 
based on an analytic model for the clustering of peaks. That our model does a better job 
suggests that the PS formalism indeed solves, at least partially, the cloud-in-cloud problem. 
Of course, it is unclear that this is any advantage when a comparison is to be made with 
the observed galaxy correlation function, since the haloes which represent galaxy clusters 
should each contain several galaxies, an effect which may be better represented by the 
peak model. 

Figure 5 shows the results for scale-free models with n = —1.5 (5a) and (5b). The 
N-body simulations used here are similar to those in EFWD, but with a larger number of 
particles (10^) and a higher force resolution (L/2500 where L is the side of the compu- 
tational box). The values of the scale factor a given on the figure give the expansion of 
the simulation since its initial condition. The initial power spectrum was normalized as 
described in EFWD. We compare analytic model with simulation in the same manner as 
in Figure 4. The long-dashed and short-dashed curves show the results for haloes selected 
by the FOF algorithm with / = 0.2 and 0.1, respectively. The solid curves show the results 
for haloes selected by a third algorithm which mimics the definition of haloes in the PS 
formalism. In this algorithm, the centres of haloes are determined by the FOF algorithm 
with a small linking length / = 0.1. The mass of each halo is taken to be the total number 
of particles in a bounding sphere within which the mass overdensity is 200. The depen- 
dence of the correlation function on the algorithm of halo identification is similar to that 
shown in Figure 4. For the scale-free model with n = — 1.5, our model prediction (crosses) 
agrees very well with the results based on the third algorithm. For the n = model, our 
model may fail to fit the results for large haloes at separations where ^ 10. At larger 



16 



separations the agreement is again reasonably good. 

To test our formalism for haloes with larger mass, we use the simulation results of 
Bahcall and Cen (1992, BC). EC have carried out large simulations to study the correlation 
functions of clusters of galaxies. Haloes are selected by an adaptive FOF algorithm which 
uses smaller linking lengths for particles in higher density regions. The relation of this 
algorithm to either the original FOF or DENMAX is not obvious. It is, however, plausible 
that this algorithm should be more effective than FOF in resolving haloes in high density 
regions. Motivated by observation, BC considered the correlation length ro, defined to 
be the scale where the two-point correlation function is unity, as a function of the mean 
separation of clusters d (defined by the mean number density of clusters nhy d = n~^/'^). 
Their result for a CDM model with erg = 0.75 is shown in Figure 6 as the dashed curve. For 
comparison, we also show in the figure the observational data, adopted again from BC. To 
compare our model with their simulations, we calculate the mean number of haloes with 
circular velocities Vc > 14, n(vc > 14), by equation (12), and estimate the mean separation 
of the haloes hy d = n~^/^ . To obtain ro, we assume that the average correlation functions 
have a power-law form oc around = 1. The prediction of our formula for the same 
CDM model is shown in Figure 6 as the solid curve. The agreement with the numerical 
simulations is gratifying. 

The strong dependence of the above correlation function on the halo identification 
algorithm may explain why different investigators sometimes get different correlation func- 
tions from similar simulations. 

4 BIAS IN THE DISTRIBUTION OF DARK HALOES 

Having shown that our analytic model gives a reasonable approximation to the corre- 
lation function of the dark haloes identified in N-body simulations, we now use this model 
to investigate how the spatial distributions of dark haloes with different circular velocities 
are related to the mass distribution. This should help us to understand how different kinds 
of object may be used to trace the mass distribution in the real universe. 

4.1 Cross-correlation between haloes and mass 

Figure 7 shows the Lagrangian space average cross-correlation functions between mass 
and haloes of differing circular velocity (equation 18). Since the initial density field has 
been linearly extrapolated to 2; = 0, these cross-correlation functions can be smaller than 
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— 1. Results are shown for CDM models with erg = 1 and erg = 0.5, and for dark haloes with 
circular velocities Vc = 50, 100, 200, 400, 800, 1600 km s~^. For comparison, the average 
mass correlation functions are shown as solid curves. The figure shows that haloes with 
Vc < 800kms~"^ in the erg = 1 model, and with Vc < 400kms~"^ in the erg = 0.5 model, 
are initially anti-correlated with overdensity. Most of these haloes are located in regions 
where the initial overdensity is negative. This anti-correlation with mass is stronger for 
haloes with smaller circular velocity. It occurs because the mass in high density regions 
has already been incorporated into larger mass haloes by 2; = 0. The effect is also stronger 
for the model with erg = 1, because the higher fluctuation amplitude implies substantially 
larger mass for "typical" haloes. The easist way to understand these results is to use an 
argument based on a peak-background split (see e.g. Efstathiou et al. 1988). Consider a 
(large) background region with (extrapolated) overdensity Sq <^ 1 -\- z. In this region the 
threshold for collapse of a small-scale density peak, St = Sc{l -\- z)^ is effectively reduced 
to St — So . The comoving number density of halos (equation 12) in this region will be 
n(VcT z^ S') calculated from equation (12) with Sc replaced by S[. = Sc — ['^o/(l + z)]. The 
average background overdensity per halo is 



It is easy to prove that, if i?o encloses a mass much larger than that of the haloes, then 
equation (18) reduces to equation (30) if we deflne ^^{Ro) = So(Ro). If Sq <^ St = Sc(l-\-z) 
we have to flrst order in Sq 



Substituting in equation (30) shows that haloes with St/ A > 1 (< 1) will be biased towards 
regions with Sq > (< 0) in the initial density fleld. Unbiased (v*) haloes are those for 
which A = St- For the CDM spectrum (1), A{R) = (5c at i? 9 Mpc for erg = 1, and 
at 3 Mpc for erg = 0.5. Thus, according to equation (9), haloes identifled at 2; = and 
with circular velocities Vc < 700kms~"^ (erg = 1), 250kms~"^ (erg = 0.5) are biased toward 
regions with Sq < in the initial density fleld. This is exactly what we have seen in Figure 
7. 

Figure 8 shows the Eulerian cross-correlation functions between haloes and mass given 
by equation (22) with z = 0. Results are shown for the same cases as in Figure 7. However, 
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these Eulerian functions are all positive because of the dynamical evolution of the mass 
density field. Furthermore, the correlation functions for different mass haloes have a similar 
shape. The relative bias in these evolved correlation functions is clearly seen, in that haloes 
with higher Vc are more strongly clustered. The effect is stronger in the model with a lower 
fluctuation amplitude erg. 

4.2 Autocorrelation of dark haloes 

Figure 9 shows the autocorrelation functions of haloes in Lagrangian space at 2; = 
obtained from equation (21). Results are shown for CDM models with erg = 1 and erg = 0.5, 
and for haloes of different circular velocity. Unlike the cross-correlation with mass, the 
amplitudes of the autocorrelation functions in Lagrangian space do not form a monotonic 
sequence according to increasing Vc- The amplitude decreases with increasing Vc until 
Vc ~ SOOkms""^ for erg = 1 {vc ~ 200kms~"^ for erg = 0.5) and then increases with Vc- 
The lowest amplitude is found for haloes whose linear radius Ri (which is related to Vc 
by equation 9) is such that A(i?i) ~ (5c, i.e. for present v* haloes. As we have seen 
in last section, haloes with A(i?i) > Sc are anticorrelated with mass initially. The high 
amplitude of the Lagrangian correlation function for low-mass haloes implies that these 
haloes are concentrated in lower density regions. The high amplitude for haloes with high 
Vc means, however, that they are concentrated in high density regions. For v* haloes, the 
initial clustering is very weak. To see this more clearly, we note that, under the condition 
that Ro ^ Ri and So ^ 1, the correlation function ^{Rq) deflned by equation (21) with 
Ri = R2 and zi = Z2 = z can be written as 

(30) 

For haloes with vi = St/Ai > 1, (^^{Ro) (z/i/Ai)^ A^(i?o), which shows that the 
Lagrangian correlation function of haloes with 1^1 ^ Ai is enhanced with respect to that 
of mass, A^(i?o), by a factor of (i^i/Ai)^. This result is the same as that for high peaks in 
the initial density fleld (see BBKS). For haloes with ui ^ 1, equation (30) gives ^^{Rq) 
A^(i?o)/'5i , which shows that the Lagrangian correlation function of these haloes is lower 
than that of mass by a constant factor. For v* haloes with 1^1 ~ 1, i^;^ ~ 0. These results 
are exactly what were shown in Figure 9. The peak-background split may also help to 
understand the above results (see e.g. Cole 1989). Due to the background modulation, 
the number of halos at redshift z will be changed by a factor n(Vc, 2;, S'^)/n(Vc^ z^ Sc). The 
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ratio between the perturbation in comoving number density of haloes (An/n) and the 
(extrapolated) density perturbation Sq is 



n 



dlnn{Vc^ S) 



1 

It 



A? 



(31) 



which is just the "bias" factor ^/^\\(B^i)/A\Roj . 

Figure 10 shows correlation functions for haloes in Eulerian space (equation 23). The 
results are shown for the same cases as in Figure 9, with each case depicted in the two 
figures by curves with the same line type. The monotonic increase of the correlation 
amplitude with Vc is now clearly seen. The dependence is stronger in the model with 
CTg = 0.5. The shapes of the correlation functions are similar, although those for haloes 
with larger Vc appear to be steeper. Comparing the results shown in both Figure 9 and 
Figure 10, we see clearly that the similarity in the shapes of the correlation functions is a 
result of dynamical evolution rather than of initial conditions. The dynamical evolution of 
the correlation functions is important for all cases except for haloes with Vc = 1600 km s~"^ 
in the model with erg = 0.5, where the correlation function in Eulerian space is almost 
identical to that in Lagrangian space. Thus in "standard CDM" the correlations of rich 
clusters are almost unaffected by dynamical evolution. Figure 10 also shows that haloes 
with Vc > V* are more strongly correlated, and those with Vc < v* are less strongly 
correlated, than mass. This kind of bias in the correlation functions of dark haloes with 
respect to mass and to each other is also due to dynamical evolutions and differs from the 
result for high peaks. For example, the amplitude of the correlation function for haloes 
with Vc = 400 km is higher than that for haloes with Vc = 100 km s~"^ by factors of 2 for 
CTg = 1, and 4 for erg = 0.5, while the factor would be about 30 if we use the amplification 
factor i^^/A^ (with v = Sc/A, see equation 30) predicted for high (v ^ 1) peaks. Thus 
the "dynamical" bias is weaker than that for high peaks. Also, the relative bias, i.e. the 
relative enhancement, of the correlation functions for haloes with different Vc depends on 
the fluctuation amplitude erg, with a lower erg giving a larger relative bias. This differs 
from the high peak case, for which the relative bias is constant. 

So far we have only considered the correlation function as a function of circular velocity 
for haloes identifled at redshift z = 0. It is interesting to see how the present-day correlation 
function of haloes changes when haloes are identifled at higher redshifts. Haloes which 
were identifled at higher redshift may have increased their mass by the present time by 
accreting, or by merging with other haloes. If galaxies formed in the high redshift haloes 
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and these galaxies kept their identity through the subsequent evolution, the correlation 
function we now calculate may be a good model for that of galaxies. In Figure 11 we show 
present-day correlation functions for haloes with Vc = 200kms~"^ which were identified at 
different redshifts z. Results are shown for CDM models with erg = 1 and with erg = 0.5. 
The correlation functions are calculated from equation (23) for zi = Z2 = z^ but with 
the density perturbation (Sq^Rq) evolved to 2; = 0. It is clearly seen that the amplitude 
of the correlation function increases with 2;, and the effect is stronger for erg = 0.5. This 
is a result of the shape of the CDM spectrum on scales i?i ~ 1 Mpc. The rms mass 
fluctuation on these scales can approximately be written as A^(i?i) (x R^'^ with 7 ~ 1 
(which corresponds to a power index n ~ —2). Since for a given Uc, -Ri oc (1 + z)~^/'^ (see 
equation 9), we have oc (1 + zf-/'^ and (^^/A^ oc (1 + zf'/'^ . Using equation (29) we 
see that haloes identifled at higher redshifts are biased toward higher density regions. The 
trend can be reversed if n > 1. 

The results presented here show clearly that the present correlation function of dark 
haloes depends not only on the mass of haloes, but also on the time when they were 
identifled. Indeed, as shown by equation (9), haloes with the same Vc but identifled at 
higher redshift have smaller mass. The results shown in Figure 11 therefore imply that the 
present-day remnants of early low-mass haloes can be more strongly clustered than present- 
day haloes of larger mass. This result is interesting. In a hierarchical clustering scenario, 
such as the CDM models considered here, haloes with high mass formed through the merger 
of low-mass haloes. If the low-mass haloes had formed galaxies with lower mass, and if these 
galaxies had neither merged with other galaxies nor signiflcantly increased their masses 
by accreting the material around them, then these galaxies would be observed today as 
low mass galaxies with strong clustering. A strong mass (or luminosity) segregation in the 
correlation function of galaxies, in the sense that galaxies with higher mass (or luminosity) 
have a stronger correlation function, is not a necessary result of the hierarchical model 
of structure formation. It is also interesting to note that, for both cases, the correlation 
of haloes at 2; 2 with a circular velocity Vc = 200kms~"^ may be as strong as that of 
present-day normal galaxies. A strong 'natural' bias (White et al. 1987) can be present. 

4.3 Bias 

As seen in subsection 4.1, low-mass haloes have initial cross-correlations with mass 
which are negative. Their positive correlations at 2; = may be due to the motion of 
both haloes and mass into overdense regions. Thus, a region with 8q < Q and Rq > R can 
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be in an overdense region with a Eulerian radius i?, if it is contained in a larger region 
with So > and Rq = i?o(-R, f^o), while a region with Sq > and Rq > R can never be 
in a underdense region with Eulerian radius R. As a result, most of the mass can have 
moved to overdense regions, although half of it was in underdense regions initially. The 
same effect could apply to low-mass dark haloes, so that they do not reside in voids at 
present time, even though they did initially. On the other hand, the positive correlation 
of low-mass haloes with mass could due primarily to strong evolution of the mass density 
field around those haloes which reside in high density regions initially, rather than to the 
evacuation of underdense regions; in this case low-mass haloes would still be biased toward 
underdense regions. To distinguish between these possibilities, we define a function 

Ar(l|0) 

•^h = 77F - 1, (32) 

n(vi,zi)V 

for type-1 haloes and for a spherical region with Eulerian radius R (corresponding to a 
volume V) and with present-day mass overdensity S^a = {Ro / RY ~ This function gives 
present-day overdensity of type-1 haloes in such a region. In Figure 12a, we show S]^ as 
a function of Sj^ for R = 20 Mpc and for haloes (identified at 2; = 0) with different Vc- 
It is clearly seen that present-day haloes with Vc < v* are still biased toward underdense 
regions. 

Figure 12b shows the same thing for v* haloes at varying identification redshift z. 
We see from the figure that at present day all these haloes have correlations comparable 
to those of the mass. These means that if each v* halo formed a galaxy at high redshift 
and this galaxy kept its identity through the subsequent evolution, then the distribution 
of galaxies would not be biased with respect to the mass. A positive bias can be obtained 
if the majority of normal galaxies form in haloes with Vc > v*. In the "standard CDM" 
model with erg = 0.5, the circular velocity of v* haloes at 2; ^ 1 is ^ lOOkms""^. Thus, 
if normal galaxies form only in dark haloes identified at 2; > 1 but with circular velocity 
Vc ^ 200kms~"^ (typically that of normal galaxies), then the distribution of normal galaxies 
should be positively biased with respect to the mass, as shown in Figure lib. The above 
results suggest that a "natural" bias can be present as a result of constraints on the 
formation epoch of the dark haloes of normal galaxies. Present-day haloes with Vc ~ 
200kms~"^ may form a different population of galaxies, for example low-surface-brightness 
galaxies. These galaxies should then have much weaker (by a factor of 2 to 3) correlations 
than normal galaxies (Mo, McGaugh & Bothun 1994). 
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5 CONCLUSION 



In this paper we have developed an analytic formalism for calculating autocorrelation 
and cross-correlation functions both for dark haloes and for mass. We have tested the 
results against various N-body simulations. The correlation functions of haloes in the N- 
body simulations depend strongly on the algorithm by which the haloes are selected. Our 
formalism agrees reasonably well with results based on algorithms which have stronger 
ability to break up large clusters than the standard FOF algorithm with a large linking 
length. We have demonstrated that our formalism can help us to understand how the 
distributions of different kinds of object are related to that of the mass. Although our 
discussions are mainly based on CDM models with = 1 and A = 0, the formalism we 
have developed can easily be extended to other (gaussian) models of structure formation. 
Also the model can be extended to situations where dark haloes are most appropriately 
defined in a different way. For example, a similar model can be developed for local maxima 
in the initial density field. One could also hope to develop similar models for higher order 
correlation functions, or to develop more realistic models in which the collapse of density 
perturbations is aspherical. 

As pointed out above, it is not straightforward to apply our model to the clustering 
of galaxies, because a dark halo may contain more than one galaxy, or may not contain 
galaxies at all. However, if more detailed modelling allows a prediction of the number 
of galaxies in a halo as a function of the properties of the galaxies and of the halo (see 
for example, Kauffmann, White & Guiderdoni 1993; Kauffmann, Guiderdoni & White 
1994; Cole et al. 1994), the model developed here can readily be extended to study the 
correlations of galaxies with respect to luminosity, morphological type, colour, or any other 
property of interest. 

APPENDIX 

For readers' convenience, we summarize in the following the main formulae that are 
needed to calculate the various correlation functions we have discussed. We point out again 
that these formulae are specific to an Einstein-de Sitter universe. A number of relatively 
minor changes have to be made in order to apply them to other cosmological models. 

For a given spectrum of initial density fluctuation, the rms mass fluctuation in a 
window with comoving radius Rq (in current units), A(i?o), is given by equation (4). The 
mass M and circular velocity Vc of a halo identifled at redshift z are related to its linear 
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scale Ri by equation (9). The comoving halo number density n(uc, z)^ expressed in current 
units, as a function of Vc (or and z is then given by equation (12). 

The cross- correlation between dark haloes and mass in Lagrangian space is given by 
equation (18), with q(0) defined by equation (13) and A/'(1|0) by equation (15). 

The cross- correlation between two types of haloes in Lagrangian space is given by 
equation (21), with A/'(1|0; 1) defined by equation (20). 

The cross- correlation between dark haloes and mass in Eulerian space is given by 
equation (22), with pE defined by equation (25). The function p(So\R) in equation (25) 
is defined by equation (26b). The evolution of the correlation function is treated by a 
spherical model which relates the Lagrangian radius Rq of a spherical region to its Eulerian 
radius R and linear overdensity Sq (equation 8). 

The cross- correlation between two types of haloes in Eulerian space is given by equation 
(23). If the initial density perturbations (described by mass shells ((5o, Ro)) are evolved to 
present time (or any other time), this formula gives the present cross-correlation between 
the locations of haloes identified at zi with those identified at Z2 ■ 
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Figure captions 



Figure 1. The three sohd curves show the evolution of mass sheUs (defineded by their 
Lagrangian radius Rq or their mass M and by the mean mass overdensity Sq within them) 
before they coUapse, i.e. for Sq < 8c, where 8c = 1.68 is shown as the upper horizontal 
dotted line. These mass shells evolve to have Eulerian radii i? = 1, 10, or 100 Mpc. These 
values are shown as the dotted vertical lines. The other three curves show three different 
random walks representing possible overdensity trajectories that end up predicting halo at 
point "H" in the i?o - f^o space (see the text for details). 

Figure 2. The probability functions p{8q\R) given by our ansatz (26) (solid curves) and 
by a variant of it based on equation (13) (dashed curves) are compared to those obtained 
from Monte Carlo trajectories as described in the text (dotted curves). The results shown 
are for a CDM spectrum and for two values of the Eulerian radius, i? = 10 and 2 Mpc. 

Figure 3. Eulerian average autocorrelation functions for haloes calculated from equation 
(23) with p{8q\R) given by model (26) (curves) and by a model based on equation (13) 
(crosses). The results are shown for a CDM model with erg = 1, and for haloes with 
different circular velocities Vc- 

Figure 4. Eulerian average autocorrelation functions for haloes with different circular 
velocities in CDM models with erg = 1 (4a) and erg = 0.5 (4b) taken from the numerical 
simulations of Gelb and Bertschinger (1993a). The solid, short-dashed and long-dashed 
curves show the resuhs for haloes selected by DENMAX, FOF(/ = 0.1) and FOF(/ = 0.2) 
respectively. The predictions of our analytic model (equation 23) for the same model 
parameters are shown as crosses. The dotted lines in the figure show the average correlation 
function which corresponds to the differential correlation function i^(r) = [5 /i~"^Mpc/r]~"^'^ 
observed for normal galaxies. The value of iVh in each panel gives the number of haloes in 
the simulation box. 

Figure 5. Eulerian average autocorrelation functions for haloes in scale-free models with 
n = —1.5 (5a) and n = (5b). The short-dashed and long-dashed curves show the results 
for haloes selected by FOF(/ = 0.1) and FOF(/ = 0.2) respectively. The solid curve shows 
the results for haloes selected by an algorithm which mimics the definition of haloes in the 
Press-Schechter formalism (see text). The predictions of our analytic model (equation 23) 
for the same model parameters are shown as crosses. The value of iVh in each panel gives 
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the number of haloes in the simulation box. 



Figure 6. The correlation length ro, defined as the scale where the two-point correlation 
function is unity, as a function of the mean separation of clusters d (defined from the mean 
number density of clusters n hj d = n~^^^ ). The dashed curve shows the result of Bahcall 
and Cen (1992) for a CDM model with erg = 0.75. The solid curve shows the result of our 
analytic model for the same case. For comparison, we also show the observational data, as 
reported by Bahcall and Cen (1992). 

Figure 7. Lagrangian average cross-correlations between haloes and mass as given by 
equation (18). Since the initial density field has been linearly extrapolated to 2; = 0, 
these cross-correlations can be smaller than —1. Results are shown for CDM models with 
CTg = 1 (7a) and erg = 0.5 (7b), and for dark haloes with circular velocity of Vc = 50 
(dotted), 100 (short-dashed), 200 (long-dashed), 400 (dotted-short-dashed), 800 (dotted- 
long-dashed), and 1600 km s~"^ (short-dashed-long-dashed curve). For comparison, average 
mass autocorrelation functions are shown as solid curves. 

Figure 8. Eulerian average cross-correlations between haloes and mass as given by equa- 
tion (22). Results are shown for CDM models with erg = 1 (8a) and erg = 0.5 (8b), and for 
dark haloes with circular velocities Vc = 50 (dotted), 100 (short-dashed), 200 (long-dashed), 
400 (dotted-short-dashed), 800 (dotted-long-dashed), and 1600 km s~"^ (short-dashed-long- 
dashed curve), respectively. Average mass autocorrelation functions are shown as solid 
curves. 

Figure 9. Lagrangian average autocorrelation functions of haloes calculated from equation 
(21). Results are shown for CDM models with erg = 1 (9a) and erg = 0.5 (9b), and for dark 
haloes with circular velocities Vc = 50 (dotted), 100 (short-dashed), 200 (long-dashed), 
400 (dotted-short-dashed), 800 (dotted-long-dashed), and 1600 km s~"^ (short-dashed-long- 
dashed curve). Average mass autocorrelation functions are shown as solid curves. 

Figure 10. Eulerian average autocorrelation functions of haloes given by equation (23). 
Results are shown for CDM models with erg = 1 (10a) and erg = 0.5 (10b), and for dark 
haloes with circular velocities Vc = 50 (dotted), 100 (short-dashed), 200 (long-dashed), 
400 (dotted-short-dashed), 800 (dotted-long-dashed), and 1600 km s~"^ (short-dashed-long- 
dashed curve). Average mass autocorrelation functions are shown as solid curves. 
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Figure 11. Present-day average autocorrelation functions for haloes with the same circular 
velocity Vc = 200kms~"^, but identified at different redshifts. These correlation functions 
are calculated from equation (23) for zi = Z2 = but with the density perturbation 
(Sq^Rq) evolved to 2; = 0. Results are shown for CDM models with erg = 1 (11a) and 
CTg = 0.5 (lib), and for identification redshift z = (dotted), 1 (short-dashed), 2 (long- 
dashed), 3 (dotted-short-dashed), and 4 (dotted-long-dashed). The present-day average 
mass autocorrelation functions are shown as solid curves. 

Figure 12. Halo overdensity S]^ as a function of mass overdensity Sj^ (equation 32) within 
a sphere with Eulerian radius R = 20 Mpc, (a) for haloes identified at 2; = and with 
Vc = 50, 100, 200, 400, 800, and 1600 km s'^ (from flat to steep curves), and (b) for v* 
haloes at 2; = 0, 1, 2, 3, and 4 (from the most curved to the least curved curves). Solid 
curves show results for erg = 1; dashed ones show those for erg = 0.5. The dotted lines 

show (5h = <^m- 
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